Language Technology
Malayalam Text Summarisation
Automatic text summarisation is one of the most challenging and interesting problems in the field of Natural Language Processing (NLP). It is the process of generating a concise and meaningful summary from text sources such as books, news articles, blog posts, research papers, emails, and tweets. There are two main approaches to text summarisation in NLP: (1) extraction-based summarisation and (2) abstraction-based summarisation.
The extractive text summarisation technique involves pulling key phrases from the source document and combining them to make a summary, whereas the abstractive technique entails paraphrasing and shortening parts of the source document. When abstraction is applied to text summarisation with deep learning, it can overcome the grammatical inconsistencies of the extractive method. Abstractive text summarisation algorithms create new phrases and sentences that relay the most useful information from the original text, much as humans do.
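The extractive idea described above can be illustrated with a minimal sketch. This is not the ICFOSS implementation; it is a simple word-frequency heuristic that works at the sentence level rather than the key-phrase level, and the names `split_sentences`, `tokenize`, and `extractive_summary` are hypothetical helpers introduced only for illustration. It assumes sentences end in `.`, `!`, or `?` and that words are separated by whitespace, which holds for most modern Malayalam text but ignores stop words and morphology.

```python
import re
import string
from collections import Counter


def split_sentences(text):
    # Assumption: a sentence ends with ., ! or ? followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence):
    # Split on whitespace and trim ASCII punctuation. A pattern such as \w+
    # is avoided because it would break Malayalam words at vowel signs.
    words = (w.strip(string.punctuation).lower() for w in sentence.split())
    return [w for w in words if w]


def extractive_summary(text, max_sentences=2):
    sentences = split_sentences(text)
    # Count how often each word occurs across the whole document.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        # Average document frequency of the words in the sentence.
        tokens = tokenize(sentence)
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    # Rank sentences by score, keep the top ones, restore original order.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in chosen)
```

Because the sentences are copied verbatim from the source, the output can read disjointedly when adjacent context is dropped; this is the kind of inconsistency that abstractive methods aim to avoid.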
With such a large amount of data circulating in the digital space, there is a need to develop machine learning algorithms that can automatically shorten longer texts and deliver accurate, fluent summaries that convey the intended message. Furthermore, text summarisation reduces reading time, speeds up the search for information, and allows more information to fit in a given space.
ICFOSS has developed a text summarisation solution for summarising Malayalam texts. Users can upload a file or paste in the content to be summarised. The solution is currently at the Proof-of-Concept (POC) stage, and work to develop the POC into a product is in progress.
Gitlab Repository
Express Interest